Algorithm selection on data streams
We explore the possibilities of meta-learning on data streams, in particular algorithm selection. In a first experiment, we calculate the characteristics of a small sample of a data stream and try to predict which classifier performs best on the entire stream. This yields promising results and interesting patterns. In a second experiment, we build a meta-classifier that predicts, based on measurable data characteristics in a window of the data stream, the best classifier for the next window. The results show that this meta-algorithm is very competitive with state-of-the-art ensembles, such as OzaBag, OzaBoost and Leveraged Bagging. The results of all experiments are made publicly available in an online experiment database, for the purposes of verifiability, reproducibility and generalizability.
Towards Meta-learning over Data Streams
Modern society produces vast streams of data. Many stream mining algorithms have been developed to capture general trends in these streams and make predictions for future observations, but relatively little is known about which algorithms perform particularly well on which kinds of data. Moreover, it is possible that the characteristics of the data change over time, and thus that a different algorithm should be recommended at various points in time. Figure 1 illustrates this. As such, we are dealing with the Algorithm Selection Problem [9] in a data stream setting. Based on measurable meta-features from a window of observations from a data stream, a meta-algorithm is built that predicts the best classifier for the next window. Our results show that this meta-algorithm is competitive with state-of-the-art data streaming ensembles, such as OzaBag [6], OzaBoost [6] and Leveraged Bagging [3].
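The window-based selection idea described above can be sketched in a few lines. This is a hypothetical illustration, not the paper's implementation: the meta-features (window size, dimensionality, class entropy) and the candidate classifiers are placeholder choices, and a real system would use richer stream meta-features and incremental learners.

```python
# Sketch: label each window with the candidate classifier that performs
# best on the *next* window, producing training data for a meta-classifier.
import numpy as np
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier


def extract_meta_features(X, y):
    """Toy meta-features of one window: size, dimensionality, class entropy."""
    counts = np.bincount(y)
    probs = counts[counts > 0] / len(y)
    entropy = -np.sum(probs * np.log2(probs))
    return np.array([len(y), X.shape[1], entropy])


def best_classifier_per_window(windows, candidates):
    """For each window, record which candidate scores best on the next window.

    Returns (meta_X, meta_y): meta-features per window and the index of the
    winning candidate, i.e. the meta-level training set.
    """
    meta_X, meta_y = [], []
    for (X_cur, y_cur), (X_next, y_next) in zip(windows, windows[1:]):
        scores = []
        for clf in candidates:
            clf.fit(X_cur, y_cur)          # train on the current window
            scores.append(clf.score(X_next, y_next))  # evaluate on the next
        meta_X.append(extract_meta_features(X_cur, y_cur))
        meta_y.append(int(np.argmax(scores)))
    return np.array(meta_X), np.array(meta_y)
```

A meta-classifier trained on `(meta_X, meta_y)` can then recommend a learner for each upcoming window from the current window's characteristics alone.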
OpenML: networked science in machine learning
Many sciences have made significant breakthroughs by adopting online tools
that help organize, structure and mine information that is too detailed to be
printed in journals. In this paper, we introduce OpenML, a place for machine
learning researchers to share and organize data in fine detail, so that they
can work more effectively, be more visible, and collaborate with others to
tackle harder problems. We discuss how OpenML relates to other examples of
networked science and what benefits it brings for machine learning research,
individual scientists, as well as students and practitioners.
Learning Multiple Defaults for Machine Learning Algorithms
The performance of modern machine learning methods highly depends on their
hyperparameter configurations. One simple way of selecting a configuration is
to use default settings, often proposed along with the publication and
implementation of a new algorithm. Those default values are usually chosen in
an ad-hoc manner to work well enough on a wide variety of datasets. To address
this problem, different automatic hyperparameter configuration algorithms have
been proposed, which select an optimal configuration per dataset. This
principled approach usually improves performance, but adds additional
algorithmic complexity and computational costs to the training procedure. As an
alternative to this, we propose learning a set of complementary default values
from a large database of prior empirical results. Selecting an appropriate
configuration on a new dataset then requires only a simple, efficient and
embarrassingly parallel search over this set. We demonstrate the effectiveness
and efficiency of the approach we propose in comparison to random search and
Bayesian Optimization.
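The selection step described above can be sketched as a small, embarrassingly parallel search over a fixed portfolio. This is a minimal illustration only: the three configurations below are hypothetical placeholders, not the complementary defaults learned in the paper, and random forests stand in for an arbitrary learner.

```python
# Sketch: score a fixed set of default configurations in parallel and
# keep the best one, instead of running per-dataset hyperparameter tuning.
from concurrent.futures import ThreadPoolExecutor

from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# A small "portfolio" of complementary defaults (illustrative values).
DEFAULTS = [
    {"n_estimators": 100, "max_features": "sqrt"},
    {"n_estimators": 300, "max_features": 0.5},
    {"n_estimators": 100, "max_features": None, "min_samples_leaf": 5},
]


def evaluate(config, X, y):
    """Cross-validated score of one candidate configuration."""
    clf = RandomForestClassifier(random_state=0, **config)
    return cross_val_score(clf, X, y, cv=3).mean()


def select_default(X, y):
    """Embarrassingly parallel: each configuration is scored independently."""
    with ThreadPoolExecutor() as pool:
        scores = list(pool.map(lambda c: evaluate(c, X, y), DEFAULTS))
    best = max(range(len(DEFAULTS)), key=scores.__getitem__)
    return DEFAULTS[best], scores[best]
```

Because the candidates are fixed in advance, the search cost is a constant number of model evaluations, all of which can run concurrently.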
A Survey of Deep Meta-Learning
Deep neural networks can achieve great successes when presented with large
data sets and sufficient computational resources. However, their ability to
learn new concepts quickly is quite limited. Meta-learning is one approach to
address this issue, by enabling the network to learn how to learn. The exciting
field of Deep Meta-Learning advances at great speed, but lacks a unified,
insightful overview of current techniques. This work presents just that. After
providing the reader with a theoretical foundation, we investigate and
summarize key methods, which are categorized into i) metric-, ii) model-, and
iii) optimization-based techniques. In addition, we identify the main open
challenges, such as performance evaluations on heterogeneous benchmarks, and
reduction of the computational costs of meta-learning. (Extended version of a book chapter in 'Metalearning: Applications to Automated Machine Learning and Data Mining', 2nd edition, forthcoming.)
Subspace Adaptation Prior for Few-Shot Learning
Gradient-based meta-learning techniques aim to distill useful prior knowledge
from a set of training tasks such that new tasks can be learned more
efficiently with gradient descent. While these methods have achieved successes
in various scenarios, they commonly adapt all parameters of trainable layers
when learning new tasks. This neglects potentially more efficient learning
strategies for a given task distribution and may be susceptible to overfitting,
especially in few-shot learning where tasks must be learned from a limited
number of examples. To address these issues, we propose Subspace Adaptation
Prior (SAP), a novel gradient-based meta-learning algorithm that jointly learns
good initialization parameters (prior knowledge) and layer-wise parameter
subspaces in the form of operation subsets that should be adaptable. In this
way, SAP can learn which operation subsets to adjust with gradient descent
based on the underlying task distribution, simultaneously decreasing the risk
of overfitting when learning new tasks. We demonstrate that this ability is
helpful as SAP yields superior or competitive performance in few-shot image
classification settings (gains between 0.1% and 3.9% in accuracy). Analysis of
the learned subspaces demonstrates that low-dimensional operations often yield
high activation strengths, indicating that they may be important for achieving
good few-shot learning performance. For reproducibility purposes, we publish
all our research code publicly. (Accepted at Machine Learning Journal, Special Issue of the ECML PKDD 2023 Journal Track.)
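The core mechanism of the abstract above, adapting only a learned subspace of parameters in the inner loop, can be illustrated with a toy analogue. This is not the SAP algorithm: it is a plain linear-regression example in which a binary mask (hypothetical, standing in for SAP's learned layer-wise subspaces) confines gradient updates to a subset of coordinates.

```python
# Toy analogue of subspace-restricted adaptation: inner-loop gradient
# descent updates only the masked parameters, leaving the rest frozen.
import numpy as np


def inner_adapt(w, mask, X, y, lr=0.1, steps=5):
    """Gradient descent on squared error, confined to the masked subspace."""
    w = w.copy()
    for _ in range(steps):
        grad = 2 * X.T @ (X @ w - y) / len(y)
        w -= lr * mask * grad  # only masked coordinates move
    return w


rng = np.random.default_rng(0)
X = rng.normal(size=(32, 4))
w_true = np.array([1.0, -2.0, 0.0, 0.0])  # target depends on 2 coordinates
y = X @ w_true

w_init = np.zeros(4)
mask = np.array([1.0, 1.0, 0.0, 0.0])  # adapt only the first two weights
w_adapted = inner_adapt(w_init, mask, X, y)
```

In SAP itself both the initialization and the subspaces are meta-learned across tasks; here the mask is fixed by hand purely to show how restricting adaptation reduces the number of moving parts per task.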
Understanding Transfer Learning and Gradient-Based Meta-Learning Techniques
Deep neural networks can yield good performance on various tasks but often
require large amounts of data to train them. Meta-learning received
considerable attention as one approach to improve the generalization of these
networks from a limited amount of data. Whilst meta-learning techniques have
been observed to be successful at this in various scenarios, recent results
suggest that when evaluated on tasks from a different data distribution than
the one used for training, a baseline that simply finetunes a pre-trained
network may be more effective than more complicated meta-learning techniques
such as MAML, which is one of the most popular meta-learning techniques. This
is surprising as the learning behaviour of MAML mimics that of finetuning: both
rely on re-using learned features. We investigate the observed performance
differences between finetuning, MAML, and another meta-learning technique
called Reptile, and show that MAML and Reptile specialize for fast adaptation
in low-data regimes of similar data distribution as the one used for training.
Our findings show that both the output layer and the noisy training conditions
induced by data scarcity play important roles in facilitating this
specialization for MAML. Lastly, we show that the pre-trained features as
obtained by the finetuning baseline are more diverse and discriminative than
those learned by MAML and Reptile. Due to this lack of diversity and
distribution specialization, MAML and Reptile may fail to generalize to
out-of-distribution tasks whereas finetuning can fall back on the diversity of
the learned features. (Accepted at Machine Learning Journal, Special Issue on Discovery Science 202….)